Exploit keyword query semantics and structure of data for effective XML keyword search

Authors

  • Khanh Nguyen
  • Jinli Cao
Abstract

Keyword search is a natural and user-friendly mechanism for querying XML data in information systems and Web-based applications. Because keyword queries are ambiguous and have limited expressiveness, one of the key tasks is to identify and return meaningful fragments as results. In this paper, we first study query keyword patterns in order to infer the user's search intention behind the input keywords. As an outcome, the keywords in a query are classified as either required information or search conditions (predicates). In addition, unlike previous work, our approach returns only the desired fragments as results: each returned result must satisfy the search conditions rather than simply contain all query keywords. To further prune irrelevant fragments, we introduce a novel notion called Relevant Lowest Common Ancestor (RLCA), which effectively and precisely captures the fragments that are meaningful and relevant to the given keyword query. We conducted extensive experimental studies to demonstrate the effectiveness of our approach.
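The abstract does not define RLCA, so the following is only a minimal sketch of the plain Lowest Common Ancestor (LCA) baseline that such notions refine, under stated assumptions. The sample document, the Dewey-style labels (tuples of child positions), the matching rule (tag equality or substring of element text), and all function names are hypothetical and are not taken from the paper.

```python
import xml.etree.ElementTree as ET
from itertools import product

# Hypothetical sample document, used only for illustration.
DOC = """
<bib>
  <paper>
    <title>XML keyword search</title>
    <author>Nguyen</author>
  </paper>
  <paper>
    <title>Graph ranking</title>
    <author>Cao</author>
  </paper>
</bib>
"""

def label_nodes(root):
    """Map each Dewey-style label (tuple of child positions) to its element."""
    labels = {}
    def walk(node, label):
        labels[label] = node
        for i, child in enumerate(node):
            walk(child, label + (i,))
    walk(root, ())
    return labels

def matches(labels, keyword):
    """Labels of nodes whose tag equals the keyword or whose text contains it."""
    kw = keyword.lower()
    return [lab for lab, node in labels.items()
            if kw == node.tag.lower() or kw in (node.text or "").lower()]

def common_prefix(a, b):
    """Longest common prefix of two labels, i.e. the label of their LCA."""
    out = []
    for x, y in zip(a, b):
        if x != y:
            break
        out.append(x)
    return tuple(out)

def lca_labels(labels, keywords):
    """LCAs of every combination that takes one matching node per keyword."""
    match_lists = [matches(labels, k) for k in keywords]
    if not all(match_lists):
        return set()  # some keyword has no match at all
    result = set()
    for combo in product(*match_lists):
        lca = combo[0]
        for lab in combo[1:]:
            lca = common_prefix(lca, lab)
        result.add(lca)
    return result

def smallest_lcas(lcas):
    """Keep only LCAs that have no proper descendant in the set."""
    return {a for a in lcas
            if not any(len(b) > len(a) and b[:len(a)] == a for b in lcas)}

root = ET.fromstring(DOC)
labels = label_nodes(root)
```

For the query `["title", "cao"]`, `lca_labels` returns both the document root and the second `paper` element, because a `title` from the first paper can be paired with the `Cao` author of the second. Filtering with `smallest_lcas` keeps only the second `paper`. This illustrates the kind of irrelevant fragment that plain LCA can return and that pruning aims to remove; it does not distinguish required information from search conditions as the paper's approach does.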


Similar resources

An Effective Path-aware Approach for Keyword Search over Data Graphs

Keyword search is known as a user-friendly alternative to structured query languages for retrieving information from graph-structured data. Efficiently retrieving relevant answers to a keyword query and effectively ranking these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...


Processing XML Keyword Search by Constructing Effective Structured Queries

Recently, keyword search has attracted a great deal of attention in XML databases. It is hard to directly improve the relevancy of XML keyword search because many keyword-matched nodes may not contribute to the results. To address this challenge, in this paper we design an adaptive XML keyword search approach, called XBridge, that can derive the semantics of a keyword query and generate a set...


Keyword Search in Bibliographic XML Data

Keyword search is a user-friendly way to query text, HTML, XML documents and even relational databases. The well-known LCA (Lowest Common Ancestor) semantics is used for XML keyword search based on the tree model. However, LCA cannot exploit the information in ID references and thus may return a large tree containing irrelevant results. Another keyword search approach based on general digra...


ICRA: Effective Semantics for Ranked XML Keyword Search

Keyword search is a user-friendly way to query XML databases. Most previous efforts in this area focus on keyword proximity search in XML based on either the tree data model or the graph (or digraph) data model. The tree data model for XML is generally simple and efficient for keyword proximity search. However, it cannot capture connections such as ID references in XML databases. In contrast, technique...


XIOTR: A Terse Ranking of XIO for XML Keyword Search

The emergence of the Web has increased interest in XML data because XML has a flexible structure. Keyword search has attracted a great deal of attention for retrieving XML data because it is a user-friendly mechanism. However, it is hard to directly improve the search quality of keyword search because many keyword-matched nodes may not contribute to the results. And in many applications, the goal is to ...




Publication date: 2010